A GROUP MEASUREMENT INSTRUMENT WAS CONSTRUCTED TO TEST
THE DEVELOPMENT OF CRITICAL THINKING IN YOUNG CHILDREN.
DESIGNED TO ELICIT CHOICES OF THE MOST SATISFACTORY OF 3
ALTERNATIVE CONCLUSIONS FOR EACH OF 26 INCOMPLETE STORIES
PRESENTED IN CARTOON PANEL FORMAT, THE INSTRUMENT MEASURES
THE DEVELOPMENT OF THE 4 PIAGETIAN CONCEPTS OF CONSERVATION,
CAUSALITY, RELATIONS, AND LOGIC. THE TEST WAS INITIALLY
ADMINISTERED TO 1,972 CHILDREN IN THE BOULDER (COLORADO)
SCHOOL DISTRICT. HOMOGENEITY RATIOS, RELIABILITY
COEFFICIENTS, AND INTERITEM CONSISTENCIES WERE COMPUTED FOR
EACH OF THE CONCEPTUAL SCALES. THE LOGIC AND RELATIONS SCALES
APPEARED TO CONTAIN ITEMS WHICH WERE TOO DISSIMILAR AND, AS A
RESULT, FAILED TO CLUSTER SATISFACTORILY. THE CONSERVATION
AND CAUSALITY SCALES, HOWEVER, EXHIBITED STATISTICALLY
SIGNIFICANT INTERITEM CONSISTENCIES. MEAN SCORES ON THE
CONSERVATION AND CAUSALITY SCALES WERE COMPUTED AS A FUNCTION
OF CHRONOLOGICAL AGE, AND AGE-RELATED TRENDS WERE FOUND TO
EXIST IN THESE AREAS. THE RESULTS THUS FAR SUGGEST THAT THIS
GROUP MEASUREMENT INSTRUMENT IS A PROMISING MEANS FOR
OBTAINING INFORMATION WHICH HAS HERETOFORE BEEN OBTAINED ONLY
IN CLINICAL TESTING WHICH INVOLVES THE EXTENSIVE INTERVIEWING
OF EACH SUBJECT. THIS PAPER WAS PRESENTED AT THE 1967 ANNUAL
CONVENTION OF THE NATIONAL SCIENCE TEACHERS ASSOCIATION. (JS)
INTRODUCTION



In the Boulder School District we have under way a project to modify our 
elementary science program by making extensive use of the products of various 
curriculum development projects. Among several outcomes, we are particularly
interested in what effect, if any, these changes have on children's ability to
think critically. Critical thinking in persons of high school age or older has
been measured successfully, most notably by the Watson-Glaser tests (1964), but
when the question of critical thinking by younger children is raised there is no
such ready guideline for defining and measuring it. In addition, the recently
much discussed contributions of Jean Piaget seem to suggest that critical
thinking in younger children differs not only quantitatively but also
qualitatively from the Watson-Glaser formulations.

Such considerations have led us to attempt to design a group measure of 
the thought processes which might be seen as the developmental precursors to the 
reasoning required in critical thinking. This has necessitated the adoption of 
certain Piagetian conceptual reference frames, but at the same time practical 
considerations have forced the development of radically different methodology. 

As is well known, the experimental work of Piaget is founded on the "clinical
method" (1963), which involves the extensive interviewing of each subject. One
of the immediate concerns regarding Piaget's use of the clinical method is that
both the test and the inquiry procedure are quite variable for each subject and
highly dependent on the skill and sensitivity of the interviewer. Though we would agree
with Piaget on the richness of the information that this approach yields, the 
method was not seen to be feasible for the present purpose because of the large 
numbers of children involved. 

The task of developing a group measure of children's ability to deal with
Piaget-type tasks was undertaken. A group of thirty or so situations, drawn
from Piaget's experiments, was identified and subsequently restructured into a
cartoon panel format similar to the example on the following page. In this
cartoon two boys are engaged in conversation and in actions relating to the task.
The story is incomplete and the task for the child is to indicate which of the
three choices best completes the story. The present form of the test contains
twenty-six such items, which represent four conceptual classes of tasks.

In the testing situation, reading difficulties were overcome by having
the test administrator read the captions while pointing to a projected image of
the page, as the children followed in their test booklets.



Conceptual Design 

Four domains of thought seemed particularly germane to the development of
critical thinking. These four concepts were conservation, causality, relations,
and logic, and were contained as sub-scales in the first form of the test. These
scales are described in the following.

1. Conservation: The concept of conservation, as described by Piaget, is
    divided into five distinct types: (1) conservation of substance, (2)
    conservation of number, (3) conservation of volume, (4) conservation
    of distance, and (5) conservation of surface. The concept of
    conservation requires the recognition that transformations of location,
    shape, position and so on are not related to changes in the amount of
    substance, distance, or volume in question. For example, changing the
    shape of a piece of clay from a ball to a pancake does not alter the
    amount of clay originally contained in the ball, despite the obvious
    changes in dimensions.

2. Causality: Children are reported to interpret reality in ways different
    from the objective and mechanistic way of adults. These diverse ways
    are called precausal explanations and include animism, dynamism, and
    artificialism, among others. For example, a child who reasons
    precausally might well explain the fact that smoke tends to rise on
    animistic grounds. That is, smoke goes up because "it wants to."

3. Relations: The concept of relations has to do with the child's ability
    to perceive the relative nature of observations. That is, whether an
    object is on the right or left side of another object is contingent on
    the observer's point of view. Other relations, such as family relations
    and order relations, also are dependent on the reference frame. For
    example, whether a person is a father or a son is relative to the
    person referred to in a particular context.

4. Logic: The logic scale consisted of class logic items and items which
    depended upon the transitive property of the greater-than relation. In
    addition to Piaget (1958), the work of Innis (1964) was used as a guide
    in the structure of the items.
RESULTS



Test Analysis. A total of 1,972 tests were scored from the first administration.
The data were subjected to a factor analysis utilizing the BC-TRY (Tryon, 1966)
system. The BC-TRY cluster analysis is an empirical method for determining
the interitem consistency within a set of items which go to make up a factor,
cluster or scale. The system was thus used to determine empirically the
interitem consistency within our conceptual scales. Two of the conceptual
scales, namely conservation and causality, appeared to hold together. The
logic and relations scales, while showing some internal consistency, appeared
to contain items which were too dissimilar and, as a result, failed to cluster
satisfactorily. A fifth scale, called the residual scale, appeared in the
empirical analysis, yet did not show any conceptual consistency. The following
table summarizes some of the statistical properties of each scale and the
total test.



Scale          Homogeneity   Reliability   Mean    Standard    Number
               Ratio         Coefficient   Score   Deviation   of Items

Conservation   0.276         0.694         .71     .28          6
Causality      0.239         0.550         .86     .23          4
Relations      0.001         0.001         .41     .33          2
Logic          0.047         0.227         .50     .22          6
Residual       0.120         0.350         .62     .26          4
Total Test     0.105         0.717         .64     .17         22



As can be seen, the conservation scale had six items, a homogeneity ratio
of .276, a reliability of .694, a mean score of .71 and a standard deviation
of .28. The homogeneity ratio is based on Scott (1960) and provides a measure
of the internal consistency of the conceptual scales as contrasted with the
purely empirically determined scales derived by the BC-TRY. The reliability
measure is Cronbach's alpha (1951) and provides a measure of reliability
which is expressed as the mean of the split-half coefficients which are
computed for all possible divisions of the test or scale into two parts.
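
The reliability measure described above can be illustrated with a short Python sketch. It computes Cronbach's alpha from a matrix of item scores using the usual variance form of the coefficient, and it also applies the Spearman-Brown step-up formula to the tabled homogeneity ratios. Treating the homogeneity ratio as if it were a mean inter-item correlation is an assumption made only for this illustration; the source does not give Scott's formula. The function names and the toy scores are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per child)."""
    k = len(scores[0])  # number of items in the scale
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def stepped_up_reliability(k, mean_r):
    """Spearman-Brown reliability of k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical right/wrong (1/0) scores: five children on a four-item scale.
toy_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
print("alpha on toy data:", round(cronbach_alpha(toy_scores), 3))

# (scale, number of items, homogeneity ratio, reported reliability) from the table.
TABLE = [
    ("Conservation", 6, 0.276, 0.694),
    ("Causality", 4, 0.239, 0.550),
    ("Relations", 2, 0.001, 0.001),
    ("Logic", 6, 0.047, 0.227),
    ("Residual", 4, 0.120, 0.350),
    ("Total Test", 22, 0.105, 0.717),
]
for name, k, ratio, reported in TABLE:
    estimate = stepped_up_reliability(k, ratio)
    print(f"{name:12s} stepped-up {estimate:.3f}   reported {reported:.3f}")
```

On the tabled values the stepped-up figures fall within about 0.01 of the reported reliabilities. This is consistent with, but does not establish, the reading of the homogeneity ratio assumed here.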
Age Normative Findings. Mean scores on the conservation and causality
scales were computed as a function of chronological age to see if developmental
trends existed. Figure 1 shows the percentage of students who equalled or
exceeded the criterion score for conservation of substance. The curve
indicates that the conservation of substance concept develops in the primary
grades; however, approximately twenty percent of the youngest children tested
appeared to have the concept while approximately twenty percent of the oldest
children tested did not.
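
The kind of tabulation behind such an age curve can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name, the record layout, and the criterion value are all assumptions, since the source does not state the criterion score that was used.

```python
from collections import defaultdict


def percent_at_criterion(records, criterion):
    """Percentage of children at each age whose scale score meets the criterion.

    records: iterable of (age_in_months, scale_score) pairs (hypothetical layout).
    criterion: minimum score counted as having the concept (assumed value).
    """
    counts = defaultdict(lambda: [0, 0])  # age -> [number meeting criterion, number tested]
    for age, score in records:
        counts[age][1] += 1
        if score >= criterion:
            counts[age][0] += 1
    return {age: 100.0 * met / total for age, (met, total) in sorted(counts.items())}


# Hypothetical scores for a handful of children, for illustration only.
sample = [(80, 1), (80, 0), (90, 2), (90, 2), (90, 0)]
for age, pct in percent_at_criterion(sample, criterion=1).items():
    print(f"{age} months: {pct:.1f}% at or above criterion")
```

Plotting these percentages against age in months gives a curve of the kind described for Figure 1.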



[Figure 1: vertical axis, percentage (10 to 100); horizontal axis, age in months (80 to 110), with First Grade, Second Grade, and Third Grade age ranges marked.*]

Figure 1 - Mean Scores on Conservation of Substance
Scale as a Function of Age in Months.

*The grade level markings indicate normal age-in-grade at the time the test was
administered.
Figure 2 shows the incidence of selection of precausal forms of
explanation. Only 14% of the responses used precausal explanations, but there
were age differences in the incidence and type of selection. Finalistic precausal
explanations were selected most frequently, followed by dynamistic, animistic,
artificialistic and realistic forms. All showed a decrease in use with age except
for dynamism, in which a slight increase with age can be detected.

[Figure 2: vertical axis, Index*; horizontal axis, Age in Months; one curve per type of precausal explanation (only the Animism label survives).]

Figure 2 - Incidence of Selection of Precausal Explanation by
Type as a Function of Age

*An index of 1.00 would mean every child of a given age in months picked precausal
explanations of a particular type each time that type was available as a response.
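
The index defined in the footnote can be read as a simple proportion: selections of a given precausal type divided by the number of times that type was offered, computed separately for each age. The sketch below follows that reading. The function name, the record layout, and the sample responses are hypothetical.

```python
from collections import defaultdict


def precausal_index(responses):
    """Index of selection by (age in months, explanation type).

    responses: iterable of (age_in_months, explanation_type, chosen) tuples, one
    per occasion on which a child met an item offering that type of precausal
    explanation; chosen is True if the child picked it (hypothetical layout).
    """
    tally = defaultdict(lambda: [0, 0])  # key -> [times picked, times offered]
    for age, kind, chosen in responses:
        tally[(age, kind)][1] += 1
        if chosen:
            tally[(age, kind)][0] += 1
    return {key: picked / offered for key, (picked, offered) in tally.items()}


# Hypothetical responses, for illustration only.
sample = [
    (84, "animism", True),
    (84, "animism", False),
    (84, "dynamism", False),
    (96, "animism", False),
]
for (age, kind), index in sorted(precausal_index(sample).items()):
    print(f"{age} months, {kind}: index {index:.2f}")
```

An index of 1.00 results only when every opportunity at a given age is taken, which matches the footnote's definition.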



DISCUSSION 



The information derived thus far indicates that the test technique is
a promising means for obtaining information about certain aspects of the
individual child's thinking which has been obtained heretofore only clinically.

The age normative data obtained on the acquisition of conservation
agree substantially with those reported by Lovell (1962) and [illegible]. In the
population tested, 87 months seemed to be the age at which the children
had acquired the conservation concept. Of equal interest, especially to
teachers, is the finding that in a first grade class (at least within our
population) three out of ten children might be expected to have this concept, in
the second grade six out of ten might be expected to have the concept, while
in the third grade only eight out of ten children can be expected to have the
concept. This suggests that, to the extent that conservation is an indicator
of the child's capability to solve problems by inversion or by compensation,
a given primary class will be divided in its instructional needs.
Information provided by a means similar to this test should be of considerable
diagnostic help to teachers in organizing such a heterogeneous class for
instruction.

The order of extinction of precausal forms of explanation is
supportive of the recent description of this development by Piaget (1967) in which
he suggests that sequential stages characterize [illegible]
mechanistic explanations characterize children's thinking. The [illegible]
smaller amounts of precausal forms of explanation [illegible]
to the picture test methodology. The child's actual precausal [illegible]
not be present in the limited number of available choices and as a result he
would tend to select the one sounding most like the [illegible]. This
tendency to choose an adult sounding explanation probably masks the extent of
precausal explanations actually present.